Large Language Models (LLMs)

LLMs data ingest

  • Example LLMs: ChatGPT, Gemini, Llama, and Claude are capable of understanding and generating human-like language, images, and music.

  • Data scope: Trained on massive datasets, including petabytes of internet text, Wikipedia, and PubMed.

  • Applications: LLMs are used in chatbots, text generation, reasoning and problem solving, and creative work.

Artificial Neural Networks

GPTs

  • Scalability: LLMs are based on Generative Pre-trained Transformers (GPTs) and can be "prompt-engineered" for complex tasks.

  • Definition: Artificial Neural Networks (ANNs) are fundamentally complex non-linear function estimators - pattern classifiers.

  • Innovation: GPTs are ANNs that implement "multi-head attention", enabling them to capture long-range patterns in training data and exhibit emergent "intelligence".
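The attention mechanism behind that bullet can be illustrated with a single head, here as a minimal pure-Python sketch of scaled dot-product attention (the function names and toy vectors are illustrative, not from the slides):

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(QK^T / sqrt(d)) V.

    Q, K, V are lists of d-dimensional vectors, one per token.
    """
    d = len(Q[0])
    outputs = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)
        # output is the attention-weighted mix of the value vectors
        outputs.append([sum(w * v[i] for w, v in zip(weights, V))
                        for i in range(len(V[0]))])
    return outputs

# toy example: two tokens, two dimensions
out = attention([[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 0.0], [0.0, 1.0]])
```

Each output row is a convex mix of the value vectors; a real GPT runs many such heads in parallel over learned projections of every token in the context window, which is what lets it relate distant parts of the input.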

Generation and Inference

FDA LLMs

  • Text Generation: LLMs create responses by predicting likely sequences of words based on billions of learned probabilities.

  • Inference Techniques: The models use sophisticated algorithms to generate text that aligns with context and user input.

  • Diversity: They can produce a wide range of responses, from factual information to hallucinations and "deepfakes".
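The "predicting likely sequences of words" step above can be sketched as sampling from a softmax distribution over a toy vocabulary (the vocabulary and logit values are made up for illustration; real models score tens of thousands of tokens per step):

```python
import math
import random

# made-up logits (unnormalized scores) over a tiny vocabulary
logits = {"the": 0.5, "cat": 2.0, "sat": 1.0, "mat": 0.1}

def next_token_probs(logits, temperature=1.0):
    # convert logits to probabilities; lower temperature sharpens the choice
    scaled = {tok: score / temperature for tok, score in logits.items()}
    m = max(scaled.values())
    exps = {tok: math.exp(s - m) for tok, s in scaled.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

def sample_next(logits, temperature=1.0, rng=random):
    # draw one token according to the softmax probabilities
    probs = next_token_probs(logits, temperature)
    r = rng.random()
    acc = 0.0
    for tok, p in probs.items():
        acc += p
        if r <= acc:
            return tok
    return tok  # guard against floating-point rounding

probs = next_token_probs(logits)
```

Because generation is sampling rather than lookup, the same prompt can yield different outputs, which is also where hallucinations enter: a fluent-looking token sequence need not be factually grounded.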

Fine-Tuning and Adaptation

Fine-tuned

  • Fine-Tuning: LLMs can be fine-tuned with data from specific domains, enhancing their relevance and performance.

  • Task-Specific: Fine-tuning produces tailored AI models for specialized applications, e.g. bioinformatics and biomedical research.

  • Alternative: Fine-tuning costs computing time; similar results can often be achieved via carefully designed prompt engineering.
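As a sketch of that prompt-engineering alternative, a few-shot prompt can be assembled as the role/content message list used by chat-style LLM APIs (the helper name and example texts are invented for illustration):

```python
def few_shot_prompt(instruction, examples, query):
    """Build a chat-style message list: a system instruction, worked
    user/assistant example pairs, then the actual query."""
    messages = [{"role": "system", "content": instruction}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages

messages = few_shot_prompt(
    "Extract the gene or protein-domain symbol from the sentence.",
    [("DUF1220 copy number varies across primates.", "DUF1220")],
    "BRCA1 mutations raise breast-cancer risk.",
)
```

The worked example steers the model toward the desired output format at inference time, with no retraining; fine-tuning bakes the same behavior into the weights instead.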

OpenAI Assistants

  • Clarification: OpenAI (the company behind ChatGPT) offers rich functionality through their API.

  • Assistants: file search over user files, a code interpreter, and function calling to external APIs by the AI.

  • Functionality: build custom AI applications around a user's data.
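A hedged sketch of what such an Assistant definition looks like, expressed as the request body one would send when creating it (field and tool-type names follow the OpenAI Assistants API; the model id and the `lookup_record` function are placeholders, not from the slides):

```python
def assistant_config(name, instructions):
    # request body combining file search, a code interpreter,
    # and a custom function the AI may ask the caller to run
    return {
        "name": name,
        "instructions": instructions,
        "model": "gpt-4o",  # placeholder model id
        "tools": [
            {"type": "file_search"},       # search over uploaded user files
            {"type": "code_interpreter"},  # sandboxed code execution
            {
                "type": "function",  # external API call delegated by the AI
                "function": {
                    "name": "lookup_record",  # hypothetical user function
                    "description": "Fetch a record from the user's data store.",
                    "parameters": {
                        "type": "object",
                        "properties": {"record_id": {"type": "string"}},
                        "required": ["record_id"],
                    },
                },
            },
        ],
    }

cfg = assistant_config(
    "BCO helper",
    "Answer questions about the user's BioCompute Object files.",
)
```

When the model decides to call `lookup_record`, the API returns the function name and arguments to the application, which executes the call against its own data and feeds the result back, which is how custom applications are built around user data.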

GPT application for BCO

OpenAI playground 2

Prompt Engineering for BCO

OpenAI playground 1

Example publication

DUF1220

BCOs via GPT

Description domain populating v2

BCOs via GPT

Execution domain populating v2

BCOs via GPT

Parametric domain populating v2

Summary & Conclusions

  • Strong NLP capabilities of GPTs produce good results.

  • Iterative training and prompting with a canonical BCO.

  • Fine-tuning with a dataset of BCO JSON / text chunks.
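The dataset preparation in the last bullet could look roughly like this sketch, which splits one BioCompute Object into chat-format fine-tuning examples, one per domain (the domain names and message texts are illustrative, not the actual pipeline; the chat-format JSONL shape is the one common fine-tuning APIs accept):

```python
import json

def bco_to_examples(bco):
    """Turn one BioCompute Object (BCO) dict into chat-format
    fine-tuning examples, one per top-level *_domain section."""
    examples = []
    for domain, content in bco.items():
        if not domain.endswith("_domain"):
            continue
        examples.append({
            "messages": [
                {"role": "system",
                 "content": "You populate BioCompute Object domains."},
                {"role": "user",
                 "content": f"Populate the {domain} for this workflow."},
                {"role": "assistant", "content": json.dumps(content)},
            ]
        })
    return examples

# toy BCO fragment (field values are illustrative)
bco = {
    "object_id": "BCO_000001",
    "description_domain": {"keywords": ["alignment"]},
    "execution_domain": {"script": ["run.sh"]},
}
examples = bco_to_examples(bco)
```

Writing each example as one JSON line would yield the JSONL file a fine-tuning job consumes; the canonical-BCO prompting from the previous bullet supplies the same structure at inference time instead.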

Thank you!